Sentences using "reinforcement learning algorithm"

What does "reinforcement learning algorithm" mean?

Example sentences

  1. Risk-sensitive reinforcement learning algorithms with generalized average criterion
    風險敏感度激勵學習的廣義平均算法
  2. A reinforcement learning algorithm based on process reward and prioritized sweeping is presented as an interference-solving strategy.
    本文提出了基于過程獎賞和優先掃除的強化學習算法作為多機器人系統的沖突消解策略。
  3. (4) A new cooperation model called MACM is presented, and based on this model, an improved distributed reinforcement learning algorithm is also proposed.
    (4)提出一種新的多agent協作模型MACM及一種改進的分布式強化學習算法。
  4. In the first chapter of this paper, a comprehensive survey of research on reinforcement learning algorithms, theory, and applications is provided. Recent developments and future directions for mobile robot navigation are also discussed.
    本文的第一章對增強學習理論、算法和應用研究的發展情況進行了全面深入的綜述評論,同時分析了移動機器人導航控制的研究現狀和發展趨勢。
  5. Reinforcement learning has been applied successfully to single-agent environments. Because it assumes that the environment is Markovian, a theoretical limitation, traditional reinforcement learning algorithms cannot be applied directly to multi-agent systems.
    由于強化學習理論的限制,在多智能體系統中馬爾科夫過程模型不再適用,因此不能把強化學習直接用于多智能體的協作學習問題。
  6. It is difficult to use "reinforcement learning algorithm" in a sentence.
  7. In this paper, by introducing joint action into traditional reinforcement learning, a new multi-agent reinforcement learning algorithm based on behavior prediction is presented, and several methods for predicting other agents' behaviors are discussed.
    在傳統強化學習方式中引入組合動作的基礎上,本文提出了一種基于行為預測的多智能體強化學習方法,研究了對其他智能體行為進行預測的幾種可行方法。
  8. The reinforcement learning algorithm was also introduced, since it is related to the ant colony algorithm and can be used in scheduling problems. 4. Some new concepts and scheduling algorithms for batch chemical processes were proposed in our studies.
    由于蟻群算法與人工智能中的強化學習算法之間有著某種聯系,同時強化學習近年來也應用于求解調度問題,因此本文也涉及到了一些強化學習的主要算法。
  9. Reinforcement learning algorithms that use the cerebellar model articulation controller (CMAC) are studied to estimate the optimal value function of Markov decision processes (MDPs) with continuous states and discrete actions. The state discretization for MDPs using SARSA-learning algorithms based on CMAC networks and direct gradient rules is analyzed. Two new coding methods for CMAC neural networks are proposed so that the learning efficiency of CMAC-based direct gradient learning algorithms can be improved.
    在求解離散行為空間Markov決策過程(MDP)最優策略的增強學習算法研究方面,研究了小腦模型關節控制器(CMAC)在MDP行為值函數逼近中的應用,分析了基于CMAC的直接梯度算法對MDP狀態空間離散化的特點,研究了兩種改進的CMAC編碼結構,即:非鄰接重疊編碼和變尺度編碼,以提高直接梯度學習算法的收斂速度和泛化性能。
  10. By means of the proposed reinforcement learning algorithm and a modified genetic algorithm, a neural network controller whose weights are optimized can generate time-series small perturbation signals to convert the chaotic oscillations of chaotic systems into desired regular ones. Computer simulations on controlling the Henon map and the logistic chaotic system have demonstrated the capacity of the presented strategy by stabilizing lower periodic orbits such as period-1 and period-2. Meanwhile, when the periodic control methodology is utilized, higher periods such as period-4 can also be successfully directed to the expected periodic orbits.
    該控制方法無需了解系統的動態特性和精確的數學模型,也不需監督學習所要求的訓練數據,通過增強學習訓練方式,采用改進遺傳算法優化神經網絡權系數,使之成為混沌控制器,便可產生控制混沌系統的時間序列小擾動信號,仿真實驗結果表明它不僅能有效鎮定混沌周期1、2等低周期軌道,而且在周期控制技術基礎上,也可成功將高周期混沌軌道(如周期4軌道)變成期望周期行為。
  11. Based on the organization rules of Internet data, the distribution laws of hyperlinks, and the naming rules of URLs, an algorithm for TVM rebuilding is established, and satisfactory experimental results are obtained by applying this algorithm. Furthermore, efforts are made to apply TVM to browsing navigation, web page classification, and reinforcement learning algorithms.
    結合互聯網資源的構建規則、鏈接分布規律和URL命名規則,論文提出了樹藤共生數據模型的重建算法,實驗結果驗證了樹藤共生模型的有效性與合理性,在此基礎上初步討論了樹藤共生模型在瀏覽導航、網頁分類和reinforcement learning算法中的應用。

Neighboring entries

  1. "reinforcement fibre" in a sentence
  2. "reinforcement fixing" in a sentence
  3. "reinforcement frame" in a sentence
  4. "reinforcement joint" in a sentence
  5. "reinforcement learning" in a sentence
  6. "reinforcement learning system" in a sentence
  7. "reinforcement management" in a sentence
  8. "reinforcement mat" in a sentence
  9. "reinforcement material" in a sentence
  10. "reinforcement measure" in a sentence

Copyright © 2025 WordTech Co.